An evolutionary differential game for regulating the role of monoclonal antibodies in treating signalling pathways in oesophageal cancer

This work presents a new framework for a competitive evolutionary game between monoclonal antibodies and signalling pathways in oesophageal cancer. The framework is based on a novel dynamical model that takes into account the dynamic progression of signalling pathways, resistance mechanisms and monoclonal antibody therapies. This game involves a scenario in which signalling pathways and monoclonal antibodies are the players competing against each other, where monoclonal antibodies use Brentuximab and Pembrolizumab dosages as strategies to counter the evolutionary resistance strategy implemented by the signalling pathways. Their interactions are described by the dynamical model, which serves as the game’s playground. The analysis and computation of two game-theoretic strategies, Stackelberg and Nash equilibria, are conducted within this framework to ascertain the most favourable outcome for the patient. By comparing Stackelberg equilibria with Nash equilibria, numerical experiments show that the Stackelberg equilibria are superior for treating signalling pathways and are critical for the success of monoclonal antibodies in improving oesophageal cancer patient outcomes.


Introduction
Oesophageal cancer (OC) ranks as the sixth most prevalent cause of cancer-related mortality on a global scale [1]. The signalling pathways within OC, encompassing monovalent ligands such as epidermal growth factor (EGF) and the complex receptor formed by the combination of the epidermal growth factor receptor (EGFR) and EGF, are of utmost importance in regulating cellular viability, proliferation and differentiation [2]. Genetic mutations frequently cause dysregulation of cancer cell signalling pathways, which in turn causes cancer cells to become resistant to treatment [3]. Several potential points of inhibition can be found along the signalling pathways; inhibitors can specifically target receptors located on the cellular membrane [4]. Monoclonal antibodies (mAbs), a form of immunotherapy, are the most effective inhibitors for OC signalling pathways in preventing resistance development. Monoclonal antibodies demonstrate a remarkable level of specificity, meaning that each antibody binds exclusively to a single target [5]. Pembrolizumab is a monoclonal antibody that specifically targets the checkpoint protein PD-1 on the surface of T cells, a type of immune cell. The mechanism of action involves the inhibition of the interaction between the checkpoint protein PD-L1, located on the surface of tumour cells, and its associated signalling pathways. This mechanism enables T cells to attack and eliminate cancerous cells [6]. Moreover, Pembrolizumab has the bonus of stimulating immune checkpoints, which play a crucial role in modulating the immune response. Until they are required, T cells are normally 'off', or inactive, owing to immune checkpoints. As a result, the T cells are suppressed in their attempts to harm the healthy tissues. The goal of developing and using mAbs has been to increase the efficiency of therapeutic agent delivery to tumour sites. Brentuximab is a monoclonal antibody that has been linked to chemotherapy agents. It works by blocking signalling pathways and transporting a chemotherapeutic agent, preventing cancer cells from spreading and multiplying [7]. Understanding the Darwinian mechanisms that drive the evolutionary dynamics of signalling pathways is proving to be a promising avenue for developing new approaches to treating this disease [8][9][10][11][12]. Using evolutionary game theory (EGT) to model the evolution of treatment resistance in signalling pathways is crucial for reaching this objective.
The study of biological interactions between players is a fruitful application of the theoretical framework of EGT [13]. In an evolutionary game, each participant represents a distinct species or population. EGT examines the dynamics between species that use varying tactics and/or characteristics. In contrast to classical game theory, these organisms do not need to act rationally, because their strategies are inherited rather than deliberately selected. These strategies possess the potential to enhance an organism's fitness, which is a measure of its ability to survive and proliferate. Consequently, individuals employing these strategies are more inclined to eventually attain population dominance [14]. In an evolutionary game, a player's success or failure depends on how well they can strategically respond to their opponent's actions. Differential games are problems within the field of EGT that focus on modelling and studying conflicts that arise in a dynamic framework [15]. The process of organismal evolution over time can be more accurately described by dynamical models, which are typically formulated as a system of differential equations.
Various deterministic models have been employed to simulate the dynamics of signalling pathways in diverse cancer types. The mathematical model proposed by Itano et al. [16] uses ordinary differential equations (ODEs) to investigate the dimerization mechanism underlying the development of Gefitinib resistance in lung cancer. Bianconi et al. [17] employ an ODE-based model to examine the correlation between the expressions of EGFR and IGF1R proteins in non-small-cell lung cancer. Cross-talk between the oestrogen receptor and the EGFR is described using a mathematical model introduced in [18]. A computational model presented in [19] simulates biochemical and metabolic interactions observed in melanoma between the PI3K/AKT and MAPK pathways. In [20], the authors investigate how AKT pathways contribute to therapy resistance in receptor tyrosine kinase (RTK) signalling in colon cancer. In a very recent work [21], the authors propose an optimal control framework to determine the best treatment strategies for controlling aberrant RTK signalling pathways in OC patients. However, none of the aforementioned models adequately describes the evolutionary dynamics of treatment-resistant signalling pathways in OC.
Some recent clinical trials have shown that treatment protocols based on evolutionary principles lead to better clinical outcomes. In [22,23], it has been shown that bipolar androgen therapy anticipates the development of resistance to androgen deprivation therapy (ADT) in advanced prostate cancer. By strategically administering androgen, it aims to restore sensitivity to ADT. In another clinical trial, it was demonstrated that even though small-cell lung cancers might develop resistance to immunotherapy, at the same time, they exhibit increased response to cytotoxicity [24]. These clinical trials demonstrate the feasibility of an evolutionary game-theoretic framework in improving clinical outcomes of cancer patients.
Motivated by the aforementioned evolutionary frameworks in other cancers and the fact that one of the primary reasons for failure of clinical trials in OC is attributed to drug resistance by cancer [25], our work focuses on an evolutionary game-theoretic framework modelling interactions of treatments and resistance in OC to improve clinical outcomes. In this context, the interaction between mAbs and signalling pathways in treating OC can be analogized to a differential game [26]. The signalling pathways undergo evolutionary modifications and exhibit adaptive responses to the treatment administered by mAbs, using various mechanisms to evade the intended therapeutic effects of the medication. The game's design confers a notable advantage upon the monoclonal antibodies. The signalling pathways exhibit a limited capacity to anticipate or adapt to therapeutic interventions that have not yet been administered. However, mAbs are able to predict the subsequent advancement of the signalling pathways. As a result, the game demonstrates a notable imbalance in power [27]. Therefore, mAbs initiate the first action by delivering treatment, while the signalling pathways subsequently respond by developing resistance to it. In other words, the signalling pathways' ability to implement adaptive strategies is inactive until the administration of a particular treatment. Within this particular context, mAbs can be regarded as assuming a leadership role, while the signalling pathways can be seen as taking on a follower position. Therefore, the treatment of the signalling pathways can be classified as a Stackelberg game.
Nevertheless, the existing treatment protocols for signalling pathways, such as the continuous administration of the maximum tolerated dose (MTD), fail to effectively exploit the advantage or disparity in the game [11]. In the context of signalling pathways therapies, repeated utilization of a consistent approach significantly increases the probability of the signalling pathways developing resistance towards the treatment. In this scenario, mAbs cannot see the signalling pathways move because the game is played simultaneously by all players. As a result, mAbs hand over the reins of leadership to the signalling pathways when they show signs of making progress. In the given context, the concept of Nash equilibrium or Nash solution emerges, wherein neither the mAbs nor the signalling pathways can independently alter their strategies in a manner that would result in a more favourable outcome for themselves [15]. Currently, there is no established and all-encompassing framework for an evolutionary differential game in the context of signalling pathways in OC. The goal of this research is to examine a class of game-theoretic formulations that can be used to determine the optimal course of treatment for a patient with OC.
This paper is organized as follows: §2 introduces an ODE model that aims to provide a comprehensive understanding of the evolutionary process of signalling pathway resistance. In §3, we show the theoretical formulation of an evolutionary differential game, which includes the analysis of both Stackelberg and Nash equilibria. Section 4 is devoted to the analysis of the existence of Nash and Stackelberg equilibria. The numerical schemes that are proposed to compute Nash and Stackelberg equilibria are shown in §5. In §6, numerical simulations are presented to support our analytical results. Finally, in §7, conclusions are presented.

An ordinary differential equation model for signalling pathways in oesophageal cancer
The presented model elucidates the mechanisms by which immunotherapies modulate signalling pathways. It explains how T cells can be directed and effectively administer chemotherapy to destroy signalling pathways through mAbs strategies. The model also predicts an expansion of the signalling pathways owing to their evolutionary resistance towards immunotherapies. Our model is constructed based on the law of mass action, with additional insights from Reed et al. [28]. We first define
- $L(t)$: the density of EGF ligand (no./volume),
- $C(t)$: the density of EGF:EGFR complex (no./volume),
- $T(t)$: the concentration of T cells per litre of blood (cells l$^{-1}$),
- $M(t)$: the concentration of chemotherapy per litre of blood (mg l$^{-1}$),
- $u_b(t)$: the dosage of Brentuximab per litre of blood (mg l$^{-1}$),
- $u_p(t)$: the dosage of Pembrolizumab per litre of blood (mg l$^{-1}$),
- $u_c(t)$: the evolutionary resistance strategy of the signalling pathways (no./volume).
The governing equations of the mathematical model are given as follows:
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 11: 240347
where $H_0$ and $R_0$ are the initial conditions for the epidermal growth factor receptor HER2 and EGFR, respectively [29], and $\alpha = R_0 - \frac{1}{2} H_0$. The following are descriptions of the terms used in (2.1):
- $\gamma L$: the exponential EGF ligand growth,
- $\frac{1}{2} a_1 \alpha L C$: the rate of change of $L$ is made up of a gain rate proportional to $LC$,
- $a_2 C$: the rate of change of $L$ is made up of a gain rate proportional to $C$.
The other parameters are described as follows:
- $\gamma$: the growth rate of monovalent ligands (EGF) (day$^{-1}$),
- $a_1$: the gain rate proportional to $LC$ (cell),
- $a_2$: the gain rate proportional to $C$ complex (cell),
- $a_3$: the rate of $L$ death caused by T cells (l$^2$ cells$^{-1}$ mg$^{-1}$),
- $b_1$: the gain rate proportional to $RL$ (cell),
- $b_2$: the loss rate proportional to $HC$ (cell),
- $b_3$: the gain rate proportional to EGF:EGFR:HER2 complex (cell),
- $b_4$: the rate of $C$ death caused by specific drug (l mg$^{-1}$),
- $k$: the rate of the natural resistance that may be present before drug exposure,
- $b$: the rate of the benefit the cell gains by reducing sensitivity to the drug,
- $\omega$: the rate of circulating T cells (cell),
- $d_1$: the rate of T cells death owing to signalling pathways (cell$^{-1}$ day$^{-1}$),
- $d_2$: the rate of the amount of Pembrolizumab injected (mg l$^{-1}$),
- $d_3$: the rate of chemotherapy drug decay (day$^{-1}$),
- $d_4$: the rate of the amount of Brentuximab injected (mg l$^{-1}$).
To improve the stability of the numerical algorithms, it is essential to non-dimensionalize the above ODE system using the following non-dimensionalized variables: (2.2) $L = q_1 \hat{L}$, $C = q_2 \hat{C}$, $T = q_3 \hat{T}$, $M = q_4 \hat{M}$, $t = q_5 \hat{t}$, $u_p = q_6 \hat{u}_p$, $u_c = q_7 \hat{u}_c$, $u_b = q_8 \hat{u}_b$, together with the corresponding scaled parameters. In this context, the scaling weights $q_i$, $i = 1, \dots, 8$, serve the purpose of non-dimensionalizing the parameters and model variables, as well as ensuring that they possess comparable ranges. The system transformed into non-dimensionalized form is referred to as (2.4), and its compact form is referred to as (2.5). In §3, we use this model to create two evolutionary differential games involving mAbs and signalling pathways.

An evolutionary differential game
A differential game is said to be complete in the context of EGT if all players are fully aware of one another's strategy spaces and cost functionals [30]. In this framework, we consider a situation in which the two players, each driven by their self-interest, have no desire to work together. We build a complete evolutionary differential game in which the mAbs and the signalling pathways are the game's players, denoted by A and S, respectively. In (3.1), $\Omega_1$ and $\Omega_2$ represent the spaces of admissible strategies for A, while $\Omega_3$ represents the space of admissible strategies for S. (3.1) We observe that $\Omega_1$, $\Omega_2$ and $\Omega_3$ are closed and convex. In (3.1), $D_1$ and $D_2$ represent the MTDs of Pembrolizumab and Brentuximab, respectively, which can be administered to achieve optimal outcomes in eradicating the signalling pathways. If the MTDs for a given patient are exceeded, there is a risk to the patient's health [31]. When cancer cells can divide and pass on their genetic mutations, a strong selective pressure is generated, pushing the cells to become resistant to treatment. The development of resistance would be stymied if this factor were not present [32]. Consequently, we set the upper limit of evolutionary resistance to be $D_3$. The model presented in (2.5) is a playground setting in which A and S employ their strategies (3.1) to surpass one another. The period of the game's development is represented by the interval $[0, T_f]$.
The primary objective of mAbs is the eradication of cancerous cells. To achieve this, they employ strategies that effectively limit the number of potential signalling pathways. Furthermore, these strategies mitigate the adverse consequences induced by T cells. Therefore, A endeavours to minimize its own objective functional, namely $J_A$ in (3.2). Five terms are stated in (3.2). In the first and second terms, maximum tumour burden $G$ is shown to be correlated with signalling pathways abundance, and maximum chemotherapy dose $Z$ is shown to be correlated with Brentuximab, where $\tau$ is the chemotherapy toxicity rate. The third term represents how Pembrolizumab regulates T cells, where $r$ represents the toxicity of Pembrolizumab. Regularization priors for Brentuximab and Pembrolizumab costs are given by the fourth and fifth terms in $J_A$, with $\mu, \eta \geq 0$ denoting the corresponding regularization weights. If the disease exhibits stabilization or reduction in size without complete eradication, the administration of treatment will persist as long as it remains well-tolerated and the dissemination of signalling pathways is effectively contained. In the event of cancer progression, the administration of treatment will be discontinued. Therefore, it can be inferred that $G$, $L$ or $C$ possess greater values compared with $Z$, $u_b$, $u_p$ or $T$, where $\tau \geq 1$, and $r$ is a positive parameter. Hence, $J_A$ is bounded below by 0. If such an outcome fails to materialize, the game will reach its conclusion.
The primary objective of S in this game is to evade death by developing resistance to the treatment being used. Thus S aims to minimize its own objective functional, namely $J_S$ in (3.3). There are four terms in (3.3). The first three terms, which together determine how to fit a solution of $J_S$ and incorporate the strategies into the evolutionary process, represent the fitness of signalling pathways [33]. The fourth term in $J_S$ is the regularization term that represents the cost of the signalling pathways resistance, and $\nu \geq 0$ is the associated regularization weight. Note that $J_S$ is bounded below by 0. Given the aforementioned preparation, a non-cooperative infinite evolutionary differential game involving mAbs and the signalling pathways, denoted by (AS), can be formulated within the framework of the calculus of variations [15]. Solving (3.2) will result in the most effective responses of A to counter emerging resistance from S. Moreover, solving (3.3) would result in the optimal responses of S in relation to the administered treatments. We consider the following best-response maps (see figure 1). (3.5) Accurately assessing the players' knowledge levels at any given moment is imperative for a comprehensive understanding of the game. The availability of information significantly impacts a player's decision-making process. Hence, the results of the game may exhibit significant variability contingent upon how the treatment is administered. In the following subsections, we will analyse the potential outcomes of the game (AS).

Stackelberg equilibrium
In the game (AS), the administration of therapy by mAbs serves as the initial action. Subsequently, the signalling pathways respond by generating countermeasures through the process of evolving their resistance. Although resistance-related molecular machinery may have been present before treatment commenced, it is possible that it did not undergo selective pressure in the form of a resistance mechanism until treatment was initiated [34]. Consequently, the therapy of signalling pathways can be conceptualized as a strategic interaction between a leader and a follower. The investigation of leader-follower dynamics was first conducted by von Stackelberg [13], revealing notable benefits for the leader. The mAbs' initial intervention as a leader, leveraging their ability to anticipate the subsequent responses of the signalling pathways, offers a pivotal chance to attain more advantageous outcomes through the strategic guidance and limitation of resistance mechanisms employed by these signalling pathways. A Stackelberg equilibrium (SE) necessitates that player A strategically determines its optimal outcome by considering the best-response curve of player S (see figure 1).
More precisely, $(x^*, u_p^*, u_b^*, u_c^*) \in (H^1(0, T_f))^4 \times \Omega_1 \times \Omega_2 \times R_S$ is a SE of the game (AS) if the following conditions hold:

Nash equilibrium
If the monoclonal antibodies cannot take advantage of the asymmetry in the game (AS) by taking the initiative, they lose the ability to do both predictive and directive work. As a result, the mAbs employ a consistent approach by repetitively administering drugs at maximum doses, even though the signalling pathways continually develop effective adaptive reactions [26]. Moreover, by adopting a treatment approach that is solely based on modifying the treatment following the progression of the signalling pathways, mAbs effectively surrender control to these pathways, consequently heightening the probability of treatment ineffectiveness. The signalling pathways and mAbs employ strategies that progress along their respective best-response curves as they engage in an iterative process of moves and countermoves. This situation results in a Nash equilibrium (NE), which is identified by the point where the two curves intersect (see figure 1). At a NE, neither the signalling pathways nor the mAbs can make strategic changes that would benefit them individually.
From a mathematical perspective, $(\bar{x}, \bar{u}_p, \bar{u}_b, \bar{u}_c) \in (H^1(0, T_f))^4 \times \Omega_1 \times \Omega_2 \times \Omega_3$ is a NE of the game (AS) if the following conditions hold: (3.7) $(\bar{x}, \bar{u}_p, \bar{u}_b, \bar{u}_c) = \arg\min J_S(x, \bar{u}_p, \bar{u}_b, u_c)$, where $(\bar{u}_p, \bar{u}_b) = R_A(\bar{u}_c)$ and $\bar{u}_c = R_S(\bar{u}_p, \bar{u}_b)$. Let $\Omega = \Omega_1 \times \Omega_2 \times \Omega_3$, and define the controls-to-state map and consider the following reduced functionals: (3.9) Then, $(\bar{u}_p, \bar{u}_b, \bar{u}_c)$ is a NE for the game (AS) if the following holds: (3.10)
When it comes to achieving NE, the treatment of single pathways presents a significant challenge for mAbs. However, success can be achieved through the use of the right strategy. Using the game's inherent asymmetry, mAbs could improve patient outcomes and reduce side effects without resorting to unsafely high doses of Pembrolizumab or Brentuximab.
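To make the distinction between the two equilibrium concepts concrete, the following minimal Python sketch uses a hypothetical static two-player game with scalar strategies; it is not the differential game (AS). The quadratic costs `J_A` and `J_S`, the closed-form best responses and all coefficients are illustrative assumptions. The NE is located as the intersection of the two best-response curves, while the SE is obtained by letting the leader minimize its cost along the follower's best-response curve.

```python
# Toy static game (hypothetical, for illustration only).
# a: leader strategy (a "dose"), s: follower strategy (a "resistance level").

def J_A(a, s):
    """Assumed leader cost: residual burden plus a dosing penalty."""
    return (1.0 + s - a) ** 2 + 0.5 * a ** 2

def J_S(a, s):
    """Assumed follower cost: track the dose, with a cost of resistance."""
    return (s - 0.5 * a) ** 2 + 0.1 * s ** 2

def best_response_S(a):
    """arg min over s of J_S(a, s), from dJ_S/ds = 2.2 s - a = 0."""
    return 5.0 * a / 11.0

def best_response_A(s):
    """arg min over a of J_A(a, s), from dJ_A/da = 3 a - 2 (1 + s) = 0."""
    return 2.0 * (1.0 + s) / 3.0

def nash(iters=200):
    """Intersection of the best-response curves by simultaneous iteration."""
    a, s = 0.0, 0.0
    for _ in range(iters):
        a, s = best_response_A(s), best_response_S(a)
    return a, s

def stackelberg(lo=0.0, hi=2.0, iters=200):
    """Leader minimizes J_A(a, R_S(a)) by ternary search on a convex function."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if J_A(m1, best_response_S(m1)) < J_A(m2, best_response_S(m2)):
            hi = m2
        else:
            lo = m1
    a = 0.5 * (lo + hi)
    return a, best_response_S(a)

if __name__ == "__main__":
    a_ne, s_ne = nash()
    a_se, s_se = stackelberg()
    print("NE:", a_ne, s_ne, "leader cost", J_A(a_ne, s_ne))
    print("SE:", a_se, s_se, "leader cost", J_A(a_se, s_se))
```

In this toy game the leader's cost at the SE is lower than at the NE and is reached with a smaller leader strategy, which mirrors the qualitative argument in the text; nothing is implied about magnitudes in the OC model.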

Theory of the evolutionary differential game
We provide a theoretical analysis of the NE and SE for the differential game (AS). In [35][36][37][38], there are analogous findings for other NE and SE differential games and optimal control problems. First, we prove that the ODE system (2.5) has non-negative solutions.
Lemma 4.1. The solution $x$ of (2.5) is non-negative if the initial condition $x_0$ is non-negative for all
Proof. We write (2.5) in the form (4.1). If $x, u \geq 0$, we obtain $R, M \geq 0$, componentwise. By multiplying both sides of (4.1) by the integrating factor vector $I = \exp\left(\int M(x, u)\, dt\right)$, we obtain an expression that gives us (4.2). Since $x_0 \geq 0$, we have $I x_0 \geq 0$. Thus, (4.2) gives us that $I x(t) \geq 0$ for all $t \in [0, T_f]$. Since $I > 0$, we have that $x(t) \geq 0$ for all $t \in [0, T_f]$. ∎
We next show some stability estimates for the solution of (2.5).
Lemma 4.2. A solution $x$ of (2.5) satisfies the following stability estimate:
Proof. Starting from (2.5), a simple application of Gronwall's inequality gives the desired result. ∎
Lemma 4.2 gives us that a solution $x$ of (2.5) is bounded. We now state and prove the existence and uniqueness of solutions of (2.5).
Proof. Since $u \in \Omega$, $u$ is bounded. From lemma 4.2, we have that $x$ is bounded. We also compute the gradients $\nabla_x f_i$ of the components of the right-hand side $f$ of (2.5) and find that $\|\nabla_x f_i(x)\|_\infty$, $i = 1, 2, 3$, is bounded. Thus $f$ is Lipschitz. Therefore the following conditions are satisfied by $f$:
(i) f is continuous with respect to x.
(ii) f is measurable with respect to t.
(iii) f is bounded.
(iv) The derivative of f with respect to x is also bounded.
Thus, $f$ satisfies the Carathéodory conditions, and so there is a unique solution $x \in (H^1(0, T_f))^4$ of (2.5). ∎
We now establish some properties of the cost functionals $J_A$, $J_S$ given in (3.2) and (3.3). Similar arguments can be found in [39,40].
Proof. We will address the properties of $J_S$; similar arguments will also hold for $J_A$. The following are the steps to proving the properties of $J_S$:
1. $J_S$ is bounded below by 0 and is coercive. To see this, let $u_c^n \in \Omega_3$ such that $\|u_c^n\|_{L^2} \leq D_3$ for all $n$, so it is bounded. Thus, there is a subsequence $u_c^{n_k}$ such that $u_c^{n_k} \rightharpoonup u_c \in \Omega_3$. Next we consider the sets $U_\alpha$. Since $J_S$ is continuous, the sets $U_\alpha$ are closed for all $\alpha \in \mathbb{R}$. Thus, $U_\alpha$ is a closed subset of a weakly sequentially compact space $\Omega_3$ and is weakly sequentially closed for all $\alpha \in \mathbb{R}$. This implies that $J_S$ is weakly sequentially lower semi-continuous.
Hence, $\Omega_3$ is closed.
4. The Fréchet differential of the operator $J_S$ at $u_c$ is a bounded linear operator $A$. To show that $A$ is the Fréchet differential of $J_S$ at $u_c$, we note that $A(h)$ is a linear operator by the linearity of the integral. To see that $A(h)$ is bounded, we use Hölder's inequality. Therefore, $A$ is the Fréchet differential of the operator $J_S$ at $u_c$. In a similar way, one can show that $J_A$ satisfies the properties in proposition 1. ∎
Owing to the objective functionals in (3.2) and (3.3) being non-convex, Nash's theorem cannot be used to prove the existence of a NE. The following theorem establishes that a NE of our game (AS) is a solution to a specified control problem. We then demonstrate that an optimal solution exists for this control problem.
The composite cost functional is defined as follows: (4.4) $\hat{J}(u_p, u_b, u_c) = \hat{J}_A(u_p, u_b, u_c) + \hat{J}_S(u_p, u_b, u_c)$, and we consider the optimal control problem (4.5) $\arg\min \hat{J}(u_p, u_b, u_c)$, $(u_p, u_b, u_c) \in \Omega$.
Theorem 4.2. If there is a minimizer $(\bar{u}_p, \bar{u}_b, \bar{u}_c)$ of (4.5), then $(\bar{u}_p, \bar{u}_b, \bar{u}_c)$ is a NE of the game (AS).
Proof. We have that $\hat{J}(\bar{u}_p, \bar{u}_b, \bar{u}_c) \leq \hat{J}(u_p, u_b, u_c)$ for all $(u_p, u_b, u_c) \in \Omega$. Thus, $\hat{J}_A(\bar{u}_p, \bar{u}_b, \bar{u}_c) + \hat{J}_S(\bar{u}_p, \bar{u}_b, \bar{u}_c) \leq \hat{J}_A(u_p, u_b, u_c) + \hat{J}_S(u_p, u_b, u_c)$. Let $(u_p, u_b, u_c) = (\bar{u}_p, \bar{u}_b, u_c)$ in $\hat{J}_S$ and $u_c = \bar{u}_c$ in $\hat{J}_A$. Then, we obtain the corresponding inequality in (3.10). Likewise, we can get the remaining one. Thus, the requirements of definition (3.10) have been satisfied. ∎
With this preparation, we are now ready to state and prove the main results of the existence of Nash and Stackelberg equilibria. The proofs use similar arguments given in [35][36][37][38].
Theorem 4.3. Let $\hat{J}$ be given as in (4.4). Then, there exists a pair $(\bar{x}, \bar{u}) \in (H^1(0, T_f))^4 \times \Omega$ such that $\bar{x}$ is a solution of (2.5), and $\bar{u}$ minimizes $\hat{J}$ in $\Omega$.
Proof. We define a map $\Lambda: \Omega \to (H^1(0, T_f))^4$ by $\Lambda(u) = x$. By theorem 4.1 and lemma 4.2, we have that $\Lambda$ is weakly sequentially continuous. Since $\hat{J}$ is bounded from below, there exists a minimizing sequence $(u_k)$, where $(x_k)$ is the corresponding sequence of states. Since $\hat{J}$ is coercive, and $\Lambda$ is bounded, we have that $(x_k, u_k)$ is bounded. By the Eberlein-Šmulian theorem, there are weakly convergent subsequences $(u_{k_l})$ and $(x_{k_l})$ such that $u_{k_l} \rightharpoonup \bar{u}$ and $x_{k_l} \rightharpoonup \bar{x}$. Since the embedding $H^1(0, T_f) \subset\subset C(0, T_f)$ is compact, the Rellich-Kondrachov theorem implies that $x_{k_l} \to \bar{x}$ strongly in $C(0, T_f)$. Now, we need to verify that $\bar{u}$ and $\bar{x}$ satisfy $\Lambda(\bar{u}) = \bar{x}$. Let $\phi \in H^1(0, T_f)$ be a test function that is compactly supported. Then, since $\Lambda$ is bounded and the state variable $x$ is bounded by lemma 4.2, we can apply the dominated convergence theorem. We have that $\hat{J}$ is sequentially weakly lower semi-continuous, which yields the desired result. ∎
Theorem 4.4. Let $J_A$, $J_S$ be given as in (3.2) and (3.3). Then the game (AS) has a SE $(x^*, u_p^*, u_b^*, u_c^*)$, where $x^*$ is a solution of (2.5).
Proof. To prove the existence of a minimizer of $J_S$, given in (3.3), we can follow the same arguments as in theorem 4.3, owing to the fact that $\Omega_3$ is a closed subset of a Hilbert space and $J_S$ is coercive in $\Omega_3$, which yields a convergent subsequence $(u_c^{m_l})$ of a minimizing sequence $(u_c^m)$ for $J_S$. The compactness result yields strong convergence of a subsequence $x^{m_l}$ in $(H^1(0, T_f))^4$ such that $x^{m_l} = \Lambda(u_c^{m_l})$. Then, we obtain the best-response curve $R_S$ of the signalling pathways. Once $R_S$ is obtained, we can prove the existence of a SE by using the same arguments again. ∎
The Fréchet differentiability of $J_A$, $J_S$ gives rise to the first-order necessary optimality conditions as follows: for the minimization problem (3.2), the optimality system is given as (4.6)-(4.8). For the minimization problem (3.3), the optimality system is given as (4.9)-(4.11) for all $u_c \in \Omega_3$. Here, $\langle \cdot, \cdot \rangle_{L^2(0, T)}$ is the standard $L^2(0, T)$ inner product, which is defined as follows:

Numerical schemes for solving Nash and Stackelberg equilibria
In this section, we present the numerical schemes used to determine the NE and SE of the game (AS). In order to address the nonlinear coupling among the strategies adopted by players, a relaxation scheme (e.g. [41]) is employed for the NE. The relaxation scheme is implemented in algorithm (5.1).
4. Compute $\bar{u}_b = \arg\min_{u_b \in \Omega_2} J_A(u_p^s, u_b, u_c^s)$.
5. Compute $\bar{u}_c = \arg\min_{u_c \in \Omega_3} J_S(u_p^s, u_b^s, u_c)$.

(u
8. End while.
To compute the SE, the following algorithm uses a sequential implementation of the relaxation method. To achieve efficacy, mAbs must possess the ability to anticipate and predict the optimal response of signalling pathways to their initial therapeutic intervention. This anticipation is obtained by solving the optimization problem linked to the signalling pathways. By solving the optimization problem of the mAbs using the optimal responses of signalling pathways as substitutes, the mAbs can determine the most effective strategies to employ. Using optimal doses of Pembrolizumab as strategies for mAbs will stimulate T cells to attack the signalling pathways. The mAbs will have alternative optimal strategies, using optimal Brentuximab doses to deliver chemotherapy in case the optimal Pembrolizumab doses are not effective enough to destroy the signalling pathways entirely [42].
Remark 5.1. The difference between the aforementioned algorithms for computing the Nash and Stackelberg equilibria is that, in the case of Nash equilibria, the three strategies $(u_p, u_b, u_c)$ are updated simultaneously using the previous iterate values, whereas in the case of Stackelberg equilibria, $u_p$ is updated first, followed by $u_c$ using the current value of $u_p$, followed by $u_b$ using the current values of $u_p$, $u_c$. In essence, one can think of the Nash algorithm as a Gauss-Jacobi iterative method and the Stackelberg algorithm as a Gauss-Seidel iterative method.
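The two update orderings described in remark 5.1 can be sketched on a hypothetical three-strategy toy problem. In the Python sketch below, the functions `argmin_p`, `argmin_b` and `argmin_c` stand in for the arg-min steps of the algorithms and are assumed closed forms, not derived from $J_A$ or $J_S$; the relaxed update `old + sigma * (new - old)` is likewise an assumed form of the relaxation step with factor σ.

```python
# Hypothetical closed-form arg-min steps of a toy quadratic game.
def argmin_p(c):
    """Stand-in for arg min over u_p of J_A, given the resistance c."""
    return 0.5 + 0.3 * c

def argmin_b(c):
    """Stand-in for arg min over u_b of J_A, given the resistance c."""
    return 0.4 + 0.2 * c

def argmin_c(p, b):
    """Stand-in for arg min over u_c of J_S, given the doses p and b."""
    return 0.25 * (p + b)

def relax(old, new, sigma):
    """Assumed relaxation step: move a fraction sigma towards the new value."""
    return old + sigma * (new - old)

def jacobi(sigma=0.5, iters=200):
    """Simultaneous update: every arg-min uses the previous iterate."""
    p = b = c = 0.0
    for _ in range(iters):
        p_bar, b_bar, c_bar = argmin_p(c), argmin_b(c), argmin_c(p, b)
        p = relax(p, p_bar, sigma)
        b = relax(b, b_bar, sigma)
        c = relax(c, c_bar, sigma)
    return p, b, c

def gauss_seidel(sigma=0.5, iters=200):
    """Sequential update: p first, then c with the current p, then b."""
    p = b = c = 0.0
    for _ in range(iters):
        p = relax(p, argmin_p(c), sigma)
        c = relax(c, argmin_c(p, b), sigma)
        b = relax(b, argmin_b(c), sigma)
    return p, b, c

if __name__ == "__main__":
    print("Jacobi-type:      ", jacobi())
    print("Gauss-Seidel-type:", gauss_seidel())
```

Only the update ordering is illustrated here: both loops reach the same fixed point of the toy problem, and the sketch does not model the leader's anticipation of the follower's response.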
In the algorithms mentioned above, it is necessary to select a relaxation factor, denoted as $\sigma$, that is sufficiently small to ensure convergence. The convergence of algorithm (5.1) can be proved using analogous arguments in [41]. We provide a sketch of the proof of convergence of algorithm (5.2). For this purpose, we define a map $N$ in terms of the adjoint variables, which are the solutions to the corresponding adjoint equations (4.7) and (4.10). By demonstrating the boundedness and Lipschitz properties of the adjoint variables using Gronwall's inequality, similar to lemma 4.2, we can conclude that the map $N$ is both bounded and Lipschitz. For the sake of clarity, the optimality systems for the minimization problems (3.2) and (3.3) are taken to be uniquely solvable, and we now consider $B(u_p^*, u_b^*, u_c^*)$ to be the largest closed ball of $\Omega$ centred at a SE $(u_p^*, u_b^*, u_c^*)$ for the game (AS). We define a map $A$ on $B(u_p^*, u_b^*, u_c^*)$.
The reduced functional corresponding to either of the minimization problems is denoted generically by $\hat{J}_o$, and the associated optimization variable as $u$. Starting with the initial guess $u_0$, we compute the first descent direction as $d_0 = -\nabla \hat{J}_o(u_0)$, where $\nabla \hat{J}_o$ is given by (4.8) or (4.11). The search directions are then obtained recursively as (5.1) $d_{k+1} = -g_{k+1} + \beta_k d_k$, where $g_k = \nabla \hat{J}_o(u_k)$, $k = 0, 1, \dots$, and the parameter $\beta_k$ is chosen according to the formula of Hager-Zhang [43] given by (5.2) $\beta_k = \frac{1}{d_k^T y_k} \left( y_k - 2 d_k \frac{\|y_k\|^2}{d_k^T y_k} \right)^T g_{k+1}$, where $y_k = g_{k+1} - g_k$. Next, a conjugate gradient descent step is used to compute the new optimization variable iterate (5.3) $u_{k+1} = u_k + \alpha_k d_k$, where $k$ is an index of the iteration step and $\alpha_k > 0$ is a step length obtained using a line search algorithm. For this line search, we use the following Armijo condition of sufficient decrease of $\hat{J}_o$: (5.4) $\hat{J}_o(u_k + \alpha_k d_k) \leq \hat{J}_o(u_k) + \delta \alpha_k \langle \nabla \hat{J}_o(u_k), d_k \rangle_{L^2}$, where $0 < \delta < 1/2$ and the scalar product $\langle u, v \rangle_{L^2}$ represents the standard $L^2([0, T])$ inner product for the minimization problems (3.2) and (3.3). The gradient update step is finally combined with the following projection step to ensure that the iterates stay in the admissible sets.
(5.5) $u_{k+1} = P_U[u_k + \alpha_k d_k]$, where $P_U[u] = \max\{0, \min\{N_i, u_i\}\}$ for all $i = 1, \dots, s$, with $U = \Omega_1$, $\Omega_2$ or $\Omega_3$, and $N_i = Z_i$, $G_i$ or $r_i$, corresponding to the minimization problems in the above algorithms. The projected NCG scheme can be summarized in the following algorithm:
1. Input: initial approximation $u_0$. Evaluate $d_0 = -\nabla_u \hat{J}_o(u_0)$, index $k = 0$, maximum $k = k_{max}$, tolerance tol.

While
, where α k is obtained using a line-search algorithm.
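To make the interplay of the Hager-Zhang update, the Armijo backtracking and the projection concrete, the following is a minimal finite-dimensional sketch in Python with NumPy, not the code used for the experiments in this paper. Several points are assumptions made for illustration only: the control is a plain vector rather than a function in $L^2([0, T])$, the names `project`, `armijo_projected` and `projected_ncg` are hypothetical, the sufficient-decrease test is measured along the projected displacement (a common variant of the Armijo condition for projected steps), and the scheme restarts with the steepest descent direction whenever the conjugate direction cannot move the iterate.

```python
import numpy as np

def project(u, upper):
    """Componentwise projection onto the box [0, upper]: max(0, min(upper, u))."""
    return np.maximum(0.0, np.minimum(upper, u))

def armijo_projected(J, u, g, d, upper, delta, alpha=1.0, max_halvings=50):
    """Backtrack on alpha until the projected trial point gives sufficient decrease.

    Assumption: the decrease is tested against <g, u_new - u>, the slope along the
    projected displacement. Returns None if no step length is accepted.
    """
    J_u = J(u)
    for _ in range(max_halvings):
        u_new = project(u + alpha * d, upper)
        slope = np.dot(g, u_new - u)
        if slope <= 0.0 and J(u_new) <= J_u + delta * slope:
            return u_new
        alpha *= 0.5
    return None

def projected_ncg(J, grad_J, u0, upper, delta=0.1, k_max=200, tol=1e-8):
    """Projected nonlinear conjugate gradient sketch with the Hager-Zhang beta."""
    u = project(np.asarray(u0, dtype=float), upper)
    g = grad_J(u)
    d = -g                      # first descent direction d_0 = -g_0
    steepest = True             # True while d is the steepest descent direction
    for _ in range(k_max):
        u_new = armijo_projected(J, u, g, d, upper, delta)
        if u_new is None or np.linalg.norm(u_new - u) < tol:
            if steepest:
                break           # even projected steepest descent cannot move: stop
            d, steepest = -g, True   # restart with steepest descent and retry
            continue
        g_new = grad_J(u_new)
        y = g_new - g
        dy = np.dot(d, y)
        if abs(dy) < 1e-14:
            beta = 0.0          # degenerate denominator: fall back to steepest descent
        else:
            beta = np.dot(y - 2.0 * d * np.dot(y, y) / dy, g_new) / dy
        d = -g_new + beta * d   # recursive search direction
        steepest = (beta == 0.0)
        u, g = u_new, g_new
    return u

# Toy usage (hypothetical data): minimize 0.5*||u - c||^2 over the box [0, 1]^3.
c = np.array([2.0, -1.0, 0.5])
u_star = projected_ncg(lambda v: 0.5 * np.sum((v - c) ** 2),
                       lambda v: v - c,
                       u0=np.full(3, 0.5),
                       upper=np.ones(3))
```

For this toy problem the minimizer is the projection of `c` onto the box, so `u_star` equals `(1, 0, 0.5)`. In the setting of this paper, `upper` would play the role of the bounds $Z$, $G$ or $r$, and `grad_J` would be evaluated through the adjoint equations.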

Numerical results
We present the numerical results of the NE and SE for the differential game (AS). For this purpose, we choose our non-dimensionalized scaling parameters as $q_1 = 10^{-2}$, $q_2 = 10^{-1}$, $q_3 = 10^{-6}$, $q_4 = 1$, $q_5 = 0.5$, $q_6 = q_8 = 1$ and $q_7 = 4$. With the original time interval as [0, 200] days, this transformation yields the final time $T_f = 100$. In the following two cases, the parameter values in the functionals $J_A$, $J_S$ are given as $\tau = 1.8328$, $Z = 1.425$, $G = 0.98039$, $r = 0.01$. We choose the values of the weights in the functional $J_A$ as $\mu = 1$, $\eta = 0.5$, and the weight in the functional $J_S$ as $\nu = 0.5$.
For test case 1, the patient data are generated as follows: we first simulate a reduced ODE system, which is obtained by excluding the treatments $u_p$, $u_b$ and the evolutionary resistance $u_c$ from (2.1). The term $u_b/(k + b u_c)$ in (2.1) uses the parameters $k = 0.1$, $b = 5$ (see [34]). We also provide initial values for the remaining variables as $H_0 = 0.1$, $R_0 = 0.105$, $\alpha = 0.055$. The non-dimensionalized initial guess for NE is given as $(u_p^0, u_b^0, u_c^0) = (0, 0, 0)$. This choice is motivated by the fact that all of the players are actively engaged in the game at the same time. Conversely, in the Stackelberg scenario, player A assumes the role of the leader and makes the first move by administering Pembrolizumab, subsequently prompting player S to respond. If this strategy fails to effectively manage the developing resistance $u_c$, A will switch to an alternative approach involving the administration of Brentuximab.
Based on this, we choose $(u_p^0, u_b^0, u_c^0) = (0.5, 0, 0)$ as our non-dimensionalized first guess for SE. We first solve the game (3.4) for NE and SE using algorithms (5.1) and (5.2), respectively. The plots of the NE and SE are shown in figures 2 and 3, respectively. As shown in figure 2, Pembrolizumab and Brentuximab are administered at the MTD, and Brentuximab is administered repeatedly at the same amount. We also observe that $u_b$ transports chemotherapy at high doses and at the same amount multiple times. With treatment, $L$ and $C$ go down, but the players play the game simultaneously, and the mAbs cannot predict how the signalling pathways will act, so they cannot adjust their treatment strategies appropriately. This leads to the Nash game, in which the mAbs give up control of the game to the signalling pathways. Consequently, $L$ and $C$ begin to spread again, as a result of the successful adaptations that the signalling pathways have made by evolving their resistance.
As shown in figure 3, the regulation or control of $L$ and $C$ is achieved through the responses of the monoclonal antibodies to the evolutionary resistance of signalling pathways. There is an imbalance in this game that makes it impossible for the signalling pathways to predict or adapt to treatments that [...]
In figure 4, it can be seen that $L$ has strictly decreased, because this test case represents a patient with a strengthened immune system compared with the previous test case. For the same reason, the chemotherapy dosage and its transporter $u_b$ are lower in this case than in test case 1. However, the resurgence of $C$ has been aided by the failure to eliminate evolutionary resistance after $t = 90$, owing to the asymmetry in the game not being exploited.
The signalling pathways $L$ and $C$ have decreased owing to the monoclonal antibodies' response to their evolutionary resistance; however, the mAbs will continue to apply the most effective treatment strategies to eliminate any remaining resistance and stop its growth once more, as we see in figure 5. $u_p$ activates T cells at lower levels than in test case 1 because this test case represents a patient with a strengthened immune system. We observe that $u_b$ exhibits a comparable pattern of low concentration based on the level of cancer resistance $u_c$ at different time points.
The results of the two cases suggest that dynamic therapy designs that explicitly account for the evolutionary dynamics of resistance, and thereby take advantage of signalling pathway asymmetries, could replace the current treatment protocols that apply the drugs at MTD. Our computational findings imply that the therapy of signalling pathways is analogous to a Stackelberg game, where monoclonal antibodies influence resistance evolution and total signalling pathway load. Therefore, compared with the Nash equilibrium, the Stackelberg equilibrium yields superior outcomes in both test cases.
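The leader's advantage described above can be illustrated on a toy static game, which is far simpler than the differential game (AS) and is not derived from it. In the Python sketch below, all names and numbers are hypothetical: player A (leader) chooses $x \ge 0$ and player S (follower) chooses $y \ge 0$, with quadratic costs $J_A = -x(a - x - y)$ and $J_S = -y(a - x - y)$ and an assumed constant $a = 12$. The NE is computed by a relaxation scheme with relaxation factor σ, in which both players move simultaneously towards their best responses, whereas the SE is computed sequentially: the leader scans a grid of its own strategies while anticipating the follower's best response.

```python
import numpy as np

a = 12.0  # hypothetical coupling constant of the toy game

def J_A(x, y):
    """Cost of the leader (player A) in the toy game."""
    return -x * (a - x - y)

def J_S(x, y):
    """Cost of the follower (player S) in the toy game."""
    return -y * (a - x - y)

def best_response_A(y):
    """argmin over x >= 0 of J_A(x, y), obtained from dJ_A/dx = 0."""
    return max(0.0, (a - y) / 2.0)

def best_response_S(x):
    """argmin over y >= 0 of J_S(x, y), obtained from dJ_S/dy = 0."""
    return max(0.0, (a - x) / 2.0)

def nash_relaxation(x, y, sigma=0.5, iters=200):
    """Relaxation scheme: both players update simultaneously with factor sigma."""
    for _ in range(iters):
        x_br, y_br = best_response_A(y), best_response_S(x)
        x = (1.0 - sigma) * x + sigma * x_br
        y = (1.0 - sigma) * y + sigma * y_br
    return x, y

def stackelberg(grid):
    """Sequential scheme: the leader minimizes J_A along the follower's best response."""
    costs = [J_A(x, best_response_S(x)) for x in grid]
    x = float(grid[int(np.argmin(costs))])
    return x, best_response_S(x)

x_ne, y_ne = nash_relaxation(0.0, 0.0)
x_se, y_se = stackelberg(np.linspace(0.0, a, 1201))
print("NE:", (x_ne, y_ne), "leader cost:", J_A(x_ne, y_ne))
print("SE:", (x_se, y_se), "leader cost:", J_A(x_se, y_se))
```

In this toy game the NE is $(4, 4)$ with leader cost $-16$, while the SE is $(6, 3)$ with leader cost $-18$, so the player who moves first and anticipates the opponent's reaction obtains a strictly lower cost. This is only an analogy for the asymmetry exploited by the mAbs; the actual computations rely on the optimality systems and the projected NCG scheme.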

Conclusions
In this paper, we proposed a new framework in which a non-cooperative evolutionary differential game is formulated between mAbs and signalling pathways in OC. For this purpose, we employed a novel evolutionary mathematical model to simulate the dynamics of signalling pathways, incorporating the phenomenon of resistance evolution. We then solved a differential game to obtain the NE and SE. The relaxation scheme and a sequential version of the relaxation scheme were used to compute NE and SE, respectively. Based on our numerical experiments, the mAbs should prioritize SE over NE in order to improve OC patient outcomes.
Ethics. This work did not require ethical approval from a human subject or animal welfare committee.
Data accessibility. The parameter values and data used to generate the results are already present in the paper. The corresponding codes to replicate the results are available via Dryad [44].
Declaration of AI use. We have not used AI-assisted technologies in creating this article.
Authors' contributions. M.A.: conceptualization, data curation, formal analysis, investigation, methodology, project

- $-a_3 \alpha u_p L T$: Pembrolizumab stimulates T cells, causing the death of EGF ligand $L$,
- $b_1(\alpha - \tfrac{1}{2}C)L$: the gain rate proportional to $RL$ makes up the rate of change of $C$,
- $-\tfrac{1}{2} b_2 (H_0 + C) C$: the rate of change of $C$ is made up of a loss rate proportional to $HC$,
- $\tfrac{1}{2} b_3 (H_0 - C)$: the gain rate proportional to the EGF : EGFR : HER2 complex makes up the rate of change of $C$,
- [...]: transports chemotherapy, preventing complex $C$ formation in the presence of evolutionary resistance,
- $-\hat{d}_1 (\hat{L} + \hat{C}) \hat{T}$: death of cells owing to the signalling pathways interactions,
- $d_2 u_p T$: the amount of Pembrolizumab injection needed for activating T cells,
- $-\hat{d}_3 \hat{M}$: the excretion and elimination of chemotherapy toxicity,
- $d_4 u_b$: the amount of Brentuximab injected.

Figure 1. A visual representation of the potential outcomes arising from the interplay of signalling pathways and mAbs.

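3. Let $u_c^n \in \Omega_3$ such that $u_c^n \to u_c \in L^2([0, T_f], \mathbb{R})$. Let $\epsilon > 0$; then, by the reverse Minkowski inequality, $\bigl| \lVert u_c^n \rVert - \lVert u_c \rVert \bigr| \le \lVert u_c^n - u_c \rVert < \epsilon$ for all sufficiently large $n$. Thus, $\lVert u_c^n \rVert \to \lVert u_c \rVert$, and since $u_c^n \in \Omega_3$, $\lVert u_c^n \rVert \le D_3$, and so,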